Abstract


Virtual sound source motion has been implemented in the U.S. Army Research Laboratory’s
Environment for Auditory Research, which contains a 57-channel spherical loudspeaker array
located in a semi-anechoic chamber. Using the low-latency PortAudio application programming
interface from the Psychophysics Toolbox Version 3, we are able to dynamically update 57
channels of streaming audio in real time using MATLAB for signal processing. Both
Distance-Based Amplitude Panning (DBAP) and Vector Base Amplitude Panning (VBAP) have been
implemented in MATLAB for controlling source motion. Sources are defined on a given path,
such as a circle, ellipse, or the “dogbone” pattern often used in aviation. Although DBAP works
convincingly for virtual sources located on the sphere defined by the loudspeaker array, VBAP is
needed to position sources outside the array. Source motion paths are defined parametrically with
respect to time, and the playback buffer updates the panned position every 11.5 ms. Based on the
source’s instantaneous distance, diffuse-field or free-field amplitude attenuation is added in
MATLAB, as is air absorption filtering. This virtual sound source method will be used for a
variety of audio simulations and auditory experiments.
1. Introduction


The Environment for Auditory Research is a state-of-the-art facility designed for a wide range of
basic and applied auditory research (1). Its multiple loudspeaker configurations can create
immersive  audio  simulations  designed  to  test  human  auditory  perception,  detection  thresholds, 
and  angular  and  distance  localization.  One  of  its  three  main  reproduction  environments,  the 
Sphere Room, contains a 57-channel spherical loudspeaker array, allowing a dense reproduction
field  along  both  azimuth  and  elevation  dimensions. 

The  Sphere  Room  allows  for  easy  reproduction  of  static  soundscape  simulations  through  a 
simple  multichannel  audio  editor,  such  as  Adobe  Audition.  However,  the  system  must  reproduce 
both  static  and  moving  sources  to  convincingly  simulate  real-world  environments.  Rendering  a 
moving source via individual pans between 57 channels becomes quite tedious in a software
environment  designed  for  static  sources.  In  Chowning’s  pioneering  work  on  source  motion,  a 
signal-processing  system  was  designed  to  precompute  synthesized  signals  and  change  amplitudes 
on  a  quadraphonic  reproduction  system  (2).  This  method  has  been  adapted  in  the  Sphere  Room 
to  precompute  channel  gains  over  time  for  a  given  source  path  and  panning  algorithm  with 
recorded rather than synthesized source signals. The array of channel gains is then used to move
the  source  by  updating  each  loudspeaker’s  gain  with  real-time  streaming  audio.  While  this 
system  was  developed  for  experiments  in  the  Sphere  Room,  it  can  also  simulate  source  motion  in 
less  dense  audio  reproduction  facilities. 
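
The precomputed gain array can be pictured as a table with one row per update step and one column per loudspeaker. The Python sketch below is illustrative only and is not the MATLAB code used in the Sphere Room; the function name render_chunks and the list-based data layout are assumptions made for the example.

```python
def render_chunks(signal, gain_table, chunk_len):
    """Yield one multichannel chunk per row of gain_table.

    signal:     list of mono source samples
    gain_table: list of rows, one row of per-channel gains per update step
    chunk_len:  number of samples rendered with each row of gains
    Each yielded chunk is a list of frames; each frame holds one sample per channel.
    """
    for step, gains in enumerate(gain_table):
        start = step * chunk_len
        mono = signal[start:start + chunk_len]
        if not mono:
            break  # the source audio ran out before the gain table did
        yield [[g * x for g in gains] for x in mono]


# Two update steps on a hypothetical two-channel system.
signal = [1.0] * 8
gain_table = [[1.0, 0.0], [0.5, 0.5]]
chunks = list(render_chunks(signal, gain_table, chunk_len=4))
```

In a streaming system, each yielded chunk would be appended to the playback buffer rather than collected in a list.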


2.  Streaming  Audio  in  MATLAB 


MATLAB  was  used  to  implement  the  source  motion  system,  which  allowed  low-level  control  of 
panning  and  signal  processing.  MATLAB  interacts  with  the  Sphere  Room’s  RME  Hammerfall 
DSP  audio  interface  via  the  PortAudio  application  programming  interface  available  through 
MATLAB’s  Psychophysics  Toolbox  Version  3  (PTB-3)  extensions  (3).  PortAudio  allows  high- 
fidelity,  low-latency  audio  control  using  static  or  streaming  buffers.  The  streaming  algorithm 
works  by  initiating  an  audio  buffer  and  then  continually  appending  small  amounts  of  audio  data 
onto  the  end  of  the  buffer  while  playback  is  active. 

The  creators  of  PTB-3  recommend  that  each  append  to  the  buffer  be  no  longer  than  half  the 
latency  of  the  audio  system.  Although  lower  latency  allows  the  system  to  handle  more  iterations 
of the streaming loop in a given amount of time, the latency must allow time to process all
real-time signal operations or the buffer will underflow. A series of tests using white noise and
sinusoidal signals showed that the Hammerfall’s internal buffer size should be set to 1024
samples, which corresponds to a system latency of 23 ms. This allows the streaming algorithm to
update every 11.5 ms without underflowing. This update speed is sufficient to provide
perceptually smooth panning paths across hundreds of individual positions. Multiple sources can
be processed simultaneously, but the buffer size must be increased proportionally to handle
real-time operations for each source.
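
The relationship between buffer size, latency, and update interval is simple arithmetic. The Python sketch below assumes a sample rate of 44.1 kHz, which is not stated in the text but is consistent with the quoted 23 ms figure; the function names are illustrative.

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """Duration of a buffer of the given length, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def max_append_samples(buffer_samples):
    """Largest append that respects the 'half the latency' recommendation."""
    return buffer_samples // 2


SAMPLE_RATE_HZ = 44100  # assumed, not given in the text

latency_ms = buffer_latency_ms(1024, SAMPLE_RATE_HZ)
append_ms = buffer_latency_ms(max_append_samples(1024), SAMPLE_RATE_HZ)
```

Under this assumption the latency comes out at about 23.2 ms and the append interval at about 11.6 ms, close to the rounded values quoted above.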


3.  Panning  Algorithms 


The  Spatial  Audio  MATLAB  Toolbox  (4)  was  used  for  panning  the  virtual  sources  within  the 
Sphere  Room.  This  toolbox  contains  functions  for  both  Distance-Based  Amplitude  Panning 
(DBAP,  also  called  Vector  Distance  Panning  in  the  toolbox)  and  Vector  Base  Amplitude 
Panning  (VBAP).  For  both  panning  algorithms,  the  source’s  motion  path  was  defined 
parametrically  with  respect  to  time.  Test  paths  included  a  circle,  ellipse,  and  the  “dogbone”  flight 
pattern often used in aviation: two long parallel lines with a larger curve at the end where the
pilot turns around to come back along the other side, as shown in figure 1.


Figure  1.  Plot  of  a  symmetrical  “dogbone”  flight  pattern. 
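
As an illustration of a path defined parametrically with respect to time, the Python sketch below builds circle and ellipse paths and samples them at the 11.5 ms update interval. The function names, the coordinate convention (x, y, z in arbitrary length units), and the example radius and period are assumptions; the dogbone path is omitted because its geometry is not specified numerically here.

```python
import math


def circle_path(radius, period_s, height=0.0):
    """Return position(t) for a horizontal circle traversed once per period."""
    def position(t):
        angle = 2.0 * math.pi * t / period_s
        return (radius * math.cos(angle), radius * math.sin(angle), height)
    return position


def ellipse_path(semi_x, semi_y, period_s, height=0.0):
    """Return position(t) for a horizontal ellipse traversed once per period."""
    def position(t):
        angle = 2.0 * math.pi * t / period_s
        return (semi_x * math.cos(angle), semi_y * math.sin(angle), height)
    return position


def sample_path(position, duration_s, step_s):
    """Evaluate the path at every update step, giving one panning point per step."""
    count = int(round(duration_s / step_s))
    return [position(i * step_s) for i in range(count)]


# One 10 s lap of a circle of radius 2, sampled every 11.5 ms.
points = sample_path(circle_path(2.0, 10.0), duration_s=10.0, step_s=0.0115)
```

A single lap of this example yields several hundred panning points, each of which would be handed to the panning algorithm to obtain one row of loudspeaker gains.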

DBAP  pans  a  sound  by  simultaneously  manipulating  the  gains  of  all  of  a  system’s  loudspeakers 
based  on  only  the  source’s  distance  to  that  loudspeaker  and  disregarding  listener  position  (5). 
DBAP  is  relatively  new  but  has  performed  well  in  subjective  tests  against  more  standard 
spatialization  techniques,  such  as  VBAP  and  Ambisonics  (6).  Because  DBAP  works  for  any 
number  of  loudspeakers  at  a  single  moment,  it  was  the  first  panning  algorithm  implemented. 
DBAP  panning  gives  a  convincing  impression  of  virtual  sources  along  the  sphere  represented  by 
the  loudspeaker  array.  Source  motion  implemented  with  DBAP  is  smooth  with  a  widened  source 
impression  because  of  the  additional  loudspeakers  used  to  produce  each  virtual  source.  However, 
the drawback of DBAP is that it cannot accurately represent positions outside the loudspeaker
array,  since  the  difference  between  the  loudspeakers’  individual  distances  to  a  faraway  source 
becomes negligible. While the creators of DBAP suggest workarounds for such situations,
such as projecting a distant source onto the loudspeaker array and then applying attenuation
effects, we decided to use DBAP only for virtual sources within the bounds of the
loudspeaker array itself.
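
The DBAP gain rule can be sketched in a few lines. The Python example below follows our understanding of the published DBAP formulation (5), in which each gain is inversely proportional to a power of the source-to-loudspeaker distance and the gains are normalized to constant total power. It is not the Spatial Audio MATLAB Toolbox implementation, and the default rolloff of 6 dB, the blur value, and the four-loudspeaker layout are assumptions chosen for the example.

```python
import math


def dbap_gains(source, speakers, rolloff_db=6.0, blur=0.1):
    """Distance-based gains, normalized so the squared gains sum to 1.

    source:     (x, y, z) position of the virtual source
    speakers:   list of (x, y, z) loudspeaker positions
    rolloff_db: level drop per doubling of distance
    blur:       small offset that keeps the distance above zero when the
                source coincides with a loudspeaker
    """
    exponent = rolloff_db / (20.0 * math.log10(2.0))
    dists = [
        math.sqrt(sum((s - p) ** 2 for s, p in zip(source, spk)) + blur ** 2)
        for spk in speakers
    ]
    k = 1.0 / math.sqrt(sum(1.0 / d ** (2.0 * exponent) for d in dists))
    return [k / d ** exponent for d in dists]


speakers = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
near = dbap_gains((0.9, 0.0, 0.0), speakers)   # source close to the first loudspeaker
far = dbap_gains((100.0, 0.0, 0.0), speakers)  # source far outside the array
```

For the nearby source, the first loudspeaker dominates. For the distant source, the four gains differ by only about 2%, which illustrates why DBAP loses directional information outside the array.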

VBAP (7) is a robustly developed panning algorithm based on the following principle: given a
set of three linearly independent vectors a, b, and c arranged in a triangle, any fourth vector d,
the desired source location, within that triangle may be created by a linear combination of the
initial three vectors:

$$l_1 \mathbf{a} + l_2 \mathbf{b} + l_3 \mathbf{c} = \mathbf{d} \tag{1}$$

If  the  first  three  vectors  represent  the  positions  of  three  loudspeakers  arranged  in  a  triangle,  then 
a  virtual  source  may  be  placed  within  that  triangle  by  scaling  and  adding  the  first  three  vectors 
together.  The  coefficients  of  each  vector  thus  become  gain  factors  for  each  of  the  three 
loudspeakers  in  the  triangle.  While  the  Spatial  Audio  MATLAB  Toolbox  contained  a  function 
for  calculating  the  VBAP  gains  for  any  given  triangle  of  loudspeaker  vectors,  unconstrained 
source  motion  would  require  the  system  to  automatically  find  which  triangle  the  source  vector 
intersected  at  any  given  point  in  time. 
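
Solving equation 1 for the three coefficients is a 3-by-3 linear system. The Python sketch below solves it with Cramer's rule and then scales the result to unit power, a normalization that is standard practice for VBAP but is an assumption here because the text does not mention it. The function names are illustrative, and this is not the toolbox's VBAP function.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Scalar triple product u . (v x w), the determinant with columns u, v, w."""
    return dot(u, cross(v, w))


def vbap_gains(d, a, b, c):
    """Gains for loudspeakers at a, b, c that place a source in direction d."""
    det = triple(a, b, c)
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker vectors are not linearly independent")
    raw = (triple(d, b, c) / det, triple(a, d, c) / det, triple(a, b, d) / det)
    norm = math.sqrt(sum(g * g for g in raw))
    return tuple(g / norm for g in raw)


a, b, c = (1, 0, 0), (0, 1, 0), (0, 0, 1)
centre = vbap_gains((1, 1, 1), a, b, c)     # equal gains on all three loudspeakers
on_speaker = vbap_gains((1, 0, 0), a, b, c)  # all of the signal on loudspeaker a
```

A negative coefficient indicates that the source direction lies outside the chosen triangle, which is why the active triangle has to be identified before the gains are computed.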

Sunday’s algorithm (8) was implemented to calculate the vector-triangle intersections. The 57
loudspeakers  in  the  Sphere  Room  define  110  distinct  triangles,  using  the  smallest  angles  possible 
in  each  case.  These  triangles’  position  vectors  each  define  a  unique  plane,  and  the  intersection 
algorithm  first  determines  whether  the  source  vector  intersects  that  triangle’s  plane.  If  it 
intersects,  the  algorithm  finds  the  plane’s  parametric  coordinates  s  and  t  that  determine  the  point 
of intersection. If the sum of s and t is between 0 and 1, then the vector intersects the given
triangle.  Finding  these  parameters  requires  only  five  distinct  dot-product  calculations,  and  the 
normal  vectors  of  each  plane  may  be  precomputed  and  stored  to  optimize  computation  for  static 
loudspeaker  arrays.  Because  of  this  optimization,  this  algorithm  is  extremely  efficient  for  VBAP 
systems  whose  loudspeaker  triangles  usually  do  not  change  position.  The  system  then  returns  the 
current  loudspeaker  triangle  based  on  the  source’s  position  and  uses  the  Spatial  Audio  MATLAB 
Toolbox’s  VBAP  function  to  calculate  the  appropriate  gains  within  the  loudspeaker  triangle. 
These  gains  are  then  applied  to  the  current  point  in  the  output  signal  and  appended  to  the 
corresponding  channels  of  the  streaming  57-channel  array  via  PortAudio. 
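
The triangle search can be sketched as follows. This Python example reflects our reading of Sunday's ray-triangle intersection routine, with the ray starting at the origin (taken here to be the center of the array, an assumption) and pointing along the source direction. The inside test used below requires s and t to be individually nonnegative as well as their sum to be at most 1. Function names are illustrative.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def ray_hits_triangle(direction, v0, v1, v2, eps=1e-9):
    """Return (s, t) if a ray from the origin hits the triangle, else None."""
    u, v = sub(v1, v0), sub(v2, v0)
    n = cross(u, v)                    # plane normal; can be precomputed
    denom = dot(n, direction)
    if abs(denom) < eps:
        return None                    # ray is parallel to the plane
    r = dot(n, v0) / denom
    if r < 0:
        return None                    # plane lies behind the ray
    w = sub(tuple(r * x for x in direction), v0)
    # The five dot products; uu, uv, and vv can also be precomputed.
    uu, uv, vv = dot(u, u), dot(u, v), dot(v, v)
    wu, wv = dot(w, u), dot(w, v)
    d = uv * uv - uu * vv
    s = (uv * wv - vv * wu) / d
    t = (uv * wu - uu * wv) / d
    if s < 0 or t < 0 or s + t > 1:
        return None                    # intersection point is outside the triangle
    return s, t


def find_triangle(direction, triangles):
    """Index of the first triangle hit by the source direction, or None."""
    for index, (v0, v1, v2) in enumerate(triangles):
        if ray_hits_triangle(direction, v0, v1, v2) is not None:
            return index
    return None


triangles = [((1, 0, 0), (0, 1, 0), (0, 0, 1)),
             ((-1, 0, 0), (0, -1, 0), (0, 0, -1))]
active = find_triangle((-1, -1, -1), triangles)
```

The returned index selects the three loudspeakers whose position vectors are then passed to the VBAP gain calculation.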


4.  Signal  Processing 


The  method  just  described  was  initially  intended  for  the  reproduction  of  high-quality  spatial 
recordings  of  aviation  vehicles  that  were  made  available  along  with  detailed  x-y-z  position  data 
over time. When these data are used, reproduction may be perfectly adapted to a given recording,
calculating  loudspeaker  gains  in  response  to  the  recorded  source’s  interpolated  movement  over 
time.  This  method  maintains  physical  accuracy  because  distance-based  attenuation,  low-pass 
filtering  due  to  air  absorption,  and  Doppler  effects  are  all  contained  within  the  original  recording. 

To make the system more flexible, however, we have begun to add further signal processing
capabilities to allow movement of an arbitrary signal. The current system can optionally
apply free-field (inverse square) or diffuse-field (inverse alone) amplitude attenuation. This
attenuation  is  added  in  real  time  to  the  streaming  audio,  although  the  attenuation  coefficients  are 
precomputed  for  each  panning  point  once  the  source’s  motion  path  is  defined.  For  air  absorption 
filtering,  following  the  suggestion  of  Huopaniemi  (9),  the  yulewalk  function  from  the  MATLAB 
Signal Processing Toolbox (10) was implemented to reverse-engineer a one-pole filter based on
known frequency-based attenuation data. Using the ANSI standard (11), we altered the system to
find attenuation by octave bands as a function of distance and humidity, interpolating between
known values if necessary. The filter coefficients are precomputed for the source path, but the
filter can be applied to each 11.5 ms “chunk” of audio in real time without causing an underflow.
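
The two per-chunk operations can be sketched as follows. This Python example is a simplified stand-in, not the MATLAB processing chain: the distance law takes an exponent of 2 for the inverse-square option and 1 for the inverse option, applied directly to the amplitude gain as the wording above suggests (an interpretation), and the filter is a generic one-pole low-pass whose pole value is a placeholder, not a coefficient fitted to absorption data. Clamping the gain inside the reference distance is also an assumption.

```python
def distance_gain(distance, reference=1.0, exponent=1.0):
    """Amplitude gain relative to the reference distance.

    exponent=2.0 gives the inverse-square option, exponent=1.0 the inverse option.
    Distances shorter than the reference are clamped so the gain never exceeds 1.
    """
    return (reference / max(distance, reference)) ** exponent


def one_pole_lowpass(chunk, pole, state=0.0):
    """Apply y[n] = (1 - pole) * x[n] + pole * y[n-1] to one chunk.

    Returns the filtered chunk and the last output sample, which is passed
    back in as `state` for the next chunk so the filter runs continuously
    across chunk boundaries.
    """
    out = []
    y = state
    for x in chunk:
        y = (1.0 - pole) * x + pole * y
        out.append(y)
    return out, y


# Filtering two consecutive chunks with carried state matches filtering
# the whole signal at once.
whole, _ = one_pole_lowpass([1.0] * 8, 0.5)
first, carried = one_pole_lowpass([1.0] * 4, 0.5)
second, _ = one_pole_lowpass([1.0] * 4, 0.5, carried)
```

Carrying the filter state from one chunk to the next is what allows a filter to be applied chunk by chunk without audible discontinuities at the boundaries.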


5.  Discussion 


The  method  outlined  in  the  previous  section  incorporates  streaming  audio  to  allow  low-latency 
virtual  source  motion  in  a  large  spherical  loudspeaker  array.  DBAP  and  VBAP  are  both 
implemented,  although  VBAP  is  only  needed  for  large-scale  simulations.  This  method  can  be 
used  with  high-quality  recordings  to  reproduce  the  exact  path  taken  by  an  object  recorded  in 
motion.  Amplitude  attenuation  and  air  absorption  filtering  have  been  added  to  allow  flexible 
control  of  arbitrary  source  signals. 

The drawback of this approach is that it makes sense only for signals whose audio content does
not  change  based  on  their  movement.  While  this  might  be  the  case  for  a  person  talking  while 
walking  at  a  low  speed  or  even  a  white  noise  burst,  which  is  fairly  abstract  in  the  first  place,  it 
does  not  really  make  sense  for  recordings  of  vehicles,  whose  engines’  outputs  vary  greatly  with 
velocity,  not  to  mention  the  difficulty  of  modeling  sound  radiation  patterns  and  source 
orientations.  It  is  also  difficult  to  obtain  high-quality  recordings  of  vehicles  moving  at  a  constant 
velocity,  since  the  microphone  is  usually  contained  within  the  vehicle  rather  than  outside  it.  An 
outside  recording  of  a  vehicle’s  engine  idling  may  be  used  with  this  method  to  obtain  variable 
results,  but  such  a  simulation  will  not  sound  convincing  to  an  experienced  listener.  The  inclusion 
of  a  flexible  Doppler  shift  based  on  the  source’s  relative  velocity  will  be  a  future  goal,  although 
as  mentioned  previously,  the  sources  that  move  fast  enough  to  create  a  significant  Doppler  shift 
would  usually  be  used  in  functions  tailored  specifically  to  the  motion  path  of  the  recorded 
source.  For  slower-moving  objects,  the  Doppler  shift  would  have  a  negligible  effect  on  the 
perception  of  the  moving  sound  source. 
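
The size of the Doppler shift can be estimated from the standard formula for a source moving directly toward or away from a stationary listener. The Python sketch below assumes a speed of sound of 343 m/s; the function names and the example speeds are illustrative and are not taken from the system described here.

```python
import math


def doppler_factor(radial_speed_mps, sound_speed_mps=343.0):
    """Ratio of observed to emitted frequency for a stationary listener.

    radial_speed_mps is positive when the source approaches the listener
    and negative when it recedes.
    """
    return sound_speed_mps / (sound_speed_mps - radial_speed_mps)


def shift_in_cents(factor):
    """Express a frequency ratio as a pitch shift in cents."""
    return 1200.0 * math.log2(factor)


walking = doppler_factor(1.5)     # slow source, such as a person walking
aircraft = doppler_factor(100.0)  # fast source, such as a low-flying aircraft
```

At a walking pace the frequency ratio is roughly 1.004, a shift of less than half a percent, whereas a source approaching at 100 m/s raises the observed frequency by about 40%.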